CRYSTALLIZABLE COMPOSITIONS COMPRISING A HEPATITIS C VIRUS NS3 PROTEASE
DOMAIN/NS4A COMPLEX AND CRYSTALS THEREBY OBTAINED 

TECHNICAL FIELD OF INVENTION

The present invention relates to compositions
and crystals of a hepatitis C virus protease in complex
with its viral cofactor. This invention also relates
to methods of using the structure coordinates of
hepatitis C virus protease in complex with a synthetic
NS4A to solve the structure of similar or homologous
proteins or protein complexes.

BACKGROUND OF THE INVENTION

Infection by hepatitis C virus (HCV) is a
compelling human medical problem. HCV is recognized as
the causative agent for most cases of non-A, non-B
hepatitis, with an estimated human seroprevalence of 1%
globally [Choo, Q.-L. et al., "Isolation of a cDNA
Clone Derived From a Blood-Borne Non-A, Non-B Viral
Hepatitis Genome", Science, 244, pp. 359-362 (1989);
Kuo, G. et al., "An Assay for Circulating Antibodies to
a Major Etiologic Virus of Human Non-A, Non-B
Hepatitis", Science, 244, pp. 362-364 (1989); Purcell,
R.H., "Hepatitis C virus: Historical perspective and
current concepts", FEMS Microbiology Reviews, 14, pp.
181-192 (1994); Van der Poel, C.L., "Hepatitis C Virus.
Epidemiology, Transmission and Prevention", in Hepatitis
C virus. Current Studies in Hematology and Blood
Transfusion, H.W. Reesink, Ed., (Basel: Karger), pp.
137-163 (1994)]. Four million individuals may be
infected in the United States alone [Alter, M.J. and
Mast, E.E., "The Epidemiology of Viral Hepatitis in
the United States", Gastroenterol. Clin. North Am., 23,
pp. 437-455 (1994)].

Upon first exposure to HCV only about 20% of
infected individuals develop acute clinical hepatitis
while others appear to resolve the infection
spontaneously. In most instances, however, the virus
establishes a chronic infection that persists for
decades [Iwarson, S., "The Natural Course of Chronic
Hepatitis", FEMS Microbiology Reviews, 14, pp. 201-204
(1994)]. This usually results in recurrent and
progressively worsening liver inflammation, which often
leads to more severe disease states such as cirrhosis
and hepatocellular carcinoma [Kew, M.C., "Hepatitis C
and Hepatocellular Carcinoma", FEMS Microbiology
Reviews, 14, pp. 211-220 (1994); Saito, I. et al.,
"Hepatitis C Virus Infection is Associated with the
Development of Hepatocellular Carcinoma", Proc. Natl.
Acad. Sci. USA, 87, pp. 6547-6549 (1990)]. Currently,
there are no broadly effective treatments for the
debilitating progression of chronic HCV.

The HCV genome encodes a polyprotein of 3010-3033
amino acids (Figure 1) [Choo, Q.-L. et al.,
"Genetic Organization and Diversity of the Hepatitis C
Virus", Proc. Natl. Acad. Sci. USA, 88, pp. 2451-2455
(1991); Kato, N. et al., "Molecular Cloning of the Human
Hepatitis C Virus Genome From Japanese Patients with
Non-A, Non-B Hepatitis", Proc. Natl. Acad. Sci. USA,
87, pp. 9524-9528 (1990); Takamizawa, A. et al.,
"Structure and Organization of the Hepatitis C Virus
Genome Isolated From Human Carriers", J. Virol., 65,
pp. 1105-1113 (1991)]. The HCV nonstructural (NS)
proteins provide catalytic machinery for viral
replication. The NS proteins are derived by
proteolytic cleavage of the polyprotein
[Bartenschlager, R. et al., "Nonstructural Protein 3 of
the Hepatitis C Virus Encodes a Serine-Type Proteinase
Required for Cleavage at the NS3/4 and NS4/5
Junctions", J. Virol., 67, pp. 3835-3844 (1993);
Grakoui, A. et al., "Characterization of the Hepatitis C
Virus-Encoded Serine Proteinase: Determination of
Proteinase-Dependent Polyprotein Cleavage Sites", J.
Virol., 67, pp. 2832-2843 (1993); Grakoui, A. et al.,
"Expression and Identification of Hepatitis C Virus
Polyprotein Cleavage Products", J. Virol., 67, pp.
1385-1395 (1993); Tomei, L. et al., "NS3 is a serine
protease required for processing of hepatitis C virus
polyprotein", J. Virol., 67, pp. 4017-4026 (1993)].

The HCV NS protein 3 (NS3) contains a serine
protease activity that helps process the majority of
the viral enzymes, and is thus considered essential for
viral replication and infectivity. It is known that
mutations in the yellow fever virus NS3 protease
decrease viral infectivity [Chambers, T.J. et al.,
"Evidence that the N-terminal Domain of Nonstructural
Protein NS3 From Yellow Fever Virus is a Serine
Protease Responsible for Site-Specific Cleavages in the
Viral Polyprotein", Proc. Natl. Acad. Sci. USA, 87, pp.
8898-8902 (1990)]. The first 181 amino acids of NS3
(residues 1027-1207 of the viral polyprotein) have been
shown to contain the serine protease domain of NS3 that
processes all four downstream sites of the HCV
polyprotein (Figure 1) [Lin, C. et al., "Hepatitis C
Virus NS3 Serine Proteinase: Trans-Cleavage
Requirements and Processing Kinetics", J. Virol., 68,
pp. 8147-8157 (1994)].

NS3 is associated with a cofactor, NS4A.
NS4A seems critical to the activity of NS3, enhancing
the proteolytic efficiency of NS3 at all of the
cleavage sites. NS4A is a 54 residue amphipathic
peptide, with a hydrophobic N-terminus and a
hydrophilic C-terminus [Failla, C. et al., "Both NS3
and NS4A are Required for Proteolytic Processing of
Hepatitis C Virus Nonstructural Proteins", J. Virol.,
68, pp. 3753-3760 (1994)]. Its function appears
complex, possibly assisting in the membrane
localization of NS3 and other viral replicase
components [Lin, C. et al., "A Central Region in the
Hepatitis C Virus NS4A Protein Allows Formation of an
Active NS3-NS4A Serine Proteinase Complex In Vivo and
In Vitro", J. Virol., 69, pp. 4373-4380 (1995b);
Shimizu, Y. et al., "Identification of the Sequence on
NS4A Required for Enhanced Cleavage of the NS5A/5B Site
by Hepatitis C Virus NS3 Protease", J. Virol., 70, pp.
127-132 (1996); Tanji, Y. et al., "Hepatitis C
Virus-Encoded Nonstructural Protein NS4A has Versatile
Functions in Viral Protein Processing", J. Virol., 69,
pp. 1575-1581 (1995)], but its best characterized
function is that of a cofactor for the NS3 protease.

The current understanding of HCV has not led
to satisfactory treatments for HCV infection. The
prospects for effective anti-HCV vaccines remain
uncertain. The only established therapy for HCV
disease is interferon treatment. However, interferons
have significant side effects [Janssen, H.L.A. et
al., "Suicide Associated with Alfa-Interferon Therapy
for Chronic Viral Hepatitis", J. Hepatol., 21, pp.
241-243 (1994); Renault, P.F. and Hoofnagle, J.H., "Side
effects of alpha interferon", Seminars in Liver Disease,
9, pp. 273-277 (1989)] and induce long term remission in
only a fraction (~25%) of cases [Weiland, O.,
"Interferon Therapy in Chronic Hepatitis C Virus
Infection", FEMS Microbiol. Rev., 14, pp. 279-288
(1994)]. Thus, there is a need for more effective
anti-HCV therapies.

The NS3 protease is considered a potential
target for antiviral agents. However, drug discovery
efforts directed towards the NS3 protein have been
hampered by the lack of structural information about
NS3 and its complex with NS4A. Such structural
information would be valuable in the
discovery of HCV NS3 protease inhibitors. However,
efforts to determine the structure of HCV NS3 protease
have been hampered by difficulties in obtaining
sufficient quantities of pure active enzyme
[Steinkuhler, C. et al., "In Vitro Activity of
Hepatitis C Virus Protease NS3 Purified from
Recombinant Baculovirus-Infected Sf9 Cells", J. Biol.
Chem., pp. 6367-6373 (1996)]. There have been no
crystals reported of any NS3 or NS3 protease domain
protein. Thus, x-ray crystallographic analysis of such
proteins has not been possible.

SUMMARY OF THE INVENTION

Applicants have solved this problem by
providing, for the first time, compositions comprising
a hepatitis C virus (HCV) NS3 protease-like polypeptide
complexed with a NS4A-like peptide and methods for
making such compositions.

The invention also provides crystals of a HCV
NS3 protease-like polypeptide/NS4A-like peptide complex
and methods for making such crystals.

The invention also provides the structure 
coordinates of a HCV NS3 protease-like 
polypeptide/NS4A-like peptide complex. 

The invention also provides a method for 
determining at least a portion of the three-dimensional 
structure of molecules or molecular complexes which 
contain at least some structurally similar features to 
a HCV NS3 serine protease domain. 

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 depicts HCV polyprotein processing. 
The locations of the HCV structural and nonstructural 
proteins are marked on a diagram of the 3011 amino acid 
polypeptide. Cleavages between the structural proteins 
by cellular signal peptidases are marked by asterisks. 
Cleavage between NS2 and NS3 is mediated by the NS2/NS3 
metallo-protease. The NS3 serine protease is 
responsible for cleavages between NS3 and NS4A, NS4A 
and NS4B, NS4B and NS5A, and NS5A and NS5B. 

Figure 2 depicts stereo ribbon diagrams of
the NS3/NS4A complex. The view is into the active site
cleft of the enzyme. Side-chains of active site
residues His-1083, Asp-1107, and Ser-1165, along with
Zn++ ligands Cys-1123, Cys-1125, and Cys-1171 are
displayed in ball-and-stick representation. Zn++, its
H2O ligand, and the β-strand formed by NS4A are also
shown.
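
The residue numbers in Figure 2 follow polyprotein numbering. Because the Background states that NS3 residues 1-181 correspond to polyprotein residues 1027-1207, the two numbering schemes differ by a fixed offset. The following is a minimal sketch of that arithmetic; the function and constant names are illustrative assumptions and do not appear in the specification.

```python
# Offset implied by the statement that NS3 residues 1-181 correspond to
# polyprotein residues 1027-1207 (hypothetical helper names throughout).
NS3_OFFSET = 1027 - 1


def polyprotein_to_ns3(position: int) -> int:
    """Convert a polyprotein residue number to NS3-local numbering."""
    local = position - NS3_OFFSET
    if local < 1:
        raise ValueError(f"residue {position} lies upstream of NS3")
    return local


def ns3_to_polyprotein(position: int) -> int:
    """Convert an NS3-local residue number to polyprotein numbering."""
    if position < 1:
        raise ValueError("NS3-local residue numbers start at 1")
    return position + NS3_OFFSET


# Residues named in the description of Figure 2, in polyprotein numbering.
FIGURE_2_RESIDUES = [
    ("His", 1083), ("Asp", 1107), ("Ser", 1165),   # active site residues
    ("Cys", 1123), ("Cys", 1125), ("Cys", 1171),   # zinc ligands
]

if __name__ == "__main__":
    for name, number in FIGURE_2_RESIDUES:
        print(f"{name}-{number} -> NS3 {name}-{polyprotein_to_ns3(number)}")
```

Under this offset, the active site residues His-1083, Asp-1107, and Ser-1165 correspond to NS3-local positions 57, 81, and 139.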

Figure 3 lists the atomic structure
coordinates for hepatitis C virus recombinant,
truncated nonstructural protein 3 (hereafter referred
to as tNS3) in complex with a synthetic peptide of the
central region of the nonstructural protein 4A
(hereafter referred to as sNS4A) as derived by X-ray
diffraction from crystals of that complex (hereafter
referred to as tNS3/sNS4A). The preparation of the
complex is described in Examples 1 and 2. The
following abbreviations are used in Figure 3:

"Atom type" refers to the element whose 
coordinates have been determined. Elements are defined 
by the first letter in the column except for zinc which 
is defined by the letters "Zn". 

"X, Y, Z" crystallographically define the 
atomic position determined for each atom. 

"B" is a thermal factor that measures
movement of the atom around its atomic center. 

"Occ" is an occupancy factor that refers to 
the fraction of the molecules in which each atom 
occupies the position specified by the coordinates. A 
value of "1" indicates that each atom has the same 
conformation, i.e., the same position, in all molecules
of the crystal. 
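
The column definitions above can be illustrated with a small parser. This is a sketch only: the layout of Figure 3 is not reproduced in this section, so the whitespace-separated column order assumed here (atom type, X, Y, Z, Occ, B), the class and function names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomRecord:
    """One coordinate row, using the column names defined for Figure 3."""
    atom_type: str
    x: float
    y: float
    z: float
    occ: float  # fraction of molecules in which the atom occupies this position
    b: float    # thermal factor

    @property
    def element(self) -> str:
        # The element is given by the first letter of the atom type,
        # except for zinc, which is written "Zn".
        if self.atom_type.upper().startswith("ZN"):
            return "Zn"
        return self.atom_type[0].upper()


def parse_atom_line(line: str) -> AtomRecord:
    """Parse 'atom_type X Y Z Occ B' (an assumed column order)."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    x, y, z, occ, b = (float(value) for value in fields[1:])
    if not 0.0 <= occ <= 1.0:
        raise ValueError("occupancy must lie between 0 and 1")
    return AtomRecord(fields[0], x, y, z, occ, b)


if __name__ == "__main__":
    # Made-up values for illustration; not taken from Figure 3.
    record = parse_atom_line("CA 12.345 6.789 -1.234 1.00 25.40")
    print(record.element, record.occ)
```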

Figure 4 shows a diagram of a system used to 
carry out the instructions encoded by the storage 
medium of Figures 5 and 6. 

Figure 5 shows a cross section of a magnetic 
storage medium. 

Figure 6 shows a cross section of an
optically-readable data storage medium.

DETAILED DESCRIPTION OF THE INVENTION

The following abbreviations are used
throughout the application:

A = Ala = Alanine          T = Thr = Threonine
V = Val = Valine           C = Cys = Cysteine
L = Leu = Leucine          Y = Tyr = Tyrosine
I = Ile = Isoleucine       N = Asn = Asparagine
P = Pro = Proline          Q = Gln = Glutamine
F = Phe = Phenylalanine    D = Asp = Aspartic Acid
W = Trp = Tryptophan       E = Glu = Glutamic Acid
M = Met = Methionine       K = Lys = Lysine
G = Gly = Glycine          R = Arg = Arginine
S = Ser = Serine           H = His = Histidine

HCV = hepatitis C virus



Additional definitions are set forth in the 
specification where necessary. 

In order that the invention described herein 
may be more fully understood, the following detailed 
description is set forth. 

Applicants have solved the above problems by 
providing, for the first time, crystallizable 
compositions comprising a HCV NS3 protease-like 
polypeptide in complex with a NS4A-like peptide. 

Thus, in one embodiment of this invention is 
provided a composition comprising a hepatitis C virus 
NS3-like polypeptide in complex with an NS4A-like 
peptide. 

The HCV NS3-like polypeptide portion of the
complex is any polypeptide which has the serine
protease activity of the naturally occurring HCV NS3
protease, particularly the ability to cleave the HCV
polyprotein. It includes HCV NS3, NS3 protease domain
polypeptides and NS3 protease domain-like polypeptides.

As used herein, the terms "HCV NS3" and "NS3"
refer to the hepatitis C virus nonstructural-3 protein
as defined in Lin, C. et al., "Hepatitis C Virus NS3
Serine Proteinase: Trans-Cleavage Requirements and
Processing Kinetics", J. Virol., 68, pp. 8147-8157
(1994).

The term "NS3 protease domain polypeptide"
refers to a truncated, serine protease portion of NS3
as defined in [Bartenschlager, R. et al.,
"Nonstructural Protein 3 of the Hepatitis C Virus
Encodes a Serine-Type Proteinase Required for Cleavage
at the NS3/4 and NS4/5 Junctions", J. Virol., 67, pp.
3835-3844 (1993); Grakoui, A. et al., "Characterization
of the Hepatitis C Virus-Encoded Serine Proteinase:
Determination of Proteinase-Dependent Polyprotein
Cleavage Sites", J. Virol., 67, pp. 2832-2843 (1993);
Grakoui, A. et al., "Expression and Identification of
Hepatitis C Virus Polyprotein Cleavage Products", J.
Virol., 67, pp. 1385-1395 (1993); Tomei, L. et al.,
"NS3 is a serine protease required for processing of
hepatitis C virus polyprotein", J. Virol., 67, pp.
4017-4026 (1993)]. The disclosure of each of these
documents is herein incorporated by reference.

The term "NS3 protease domain-like 
polypeptides" refers to polypeptides that differ from 
NS3 protease domain polypeptides by having amino acid 
deletions, substitutions, and additions, but which 
retain the serine protease activity of NS3. 

Preferably, the NS3-like polypeptide in the 
compositions of this invention is tNS3, a recombinantly 
produced hepatitis C virus protease domain protein that
is prepared as described herein. 

The NS4A-like peptide portion of the
compositions of this invention is any peptide or
peptide mimetic that is capable of acting as a NS4A
cofactor for the NS3. These include NS4A, peptide
fragments thereof and other peptides that differ from
NS4A by having amino acid deletions, substitutions, and
additions, while retaining the above-described
activity.

As used herein the term "NS4A" refers to the
hepatitis C virus nonstructural protein 4A which acts
as a cofactor for NS3 protease [Failla, C. et al.,
"Both NS3 and NS4A are Required for Proteolytic
Processing of Hepatitis C Virus Nonstructural Proteins",
J. Virol., 68, pp. 3753-3760 (1994); Lin, C. et al.,
"Hepatitis C Virus NS3 Serine Proteinase:
Trans-Cleavage Requirements and Processing Kinetics", J.
Virol., 68, pp. 8147-8157 (1994b)].

Preferably, the NS4A-like peptide is sNS4A,
the synthetic peptide H-KKGSVVIVGRIVLSGKPAIIPKK-OH.
This peptide encompasses the essential NS3 protease
domain residues of NS4A.

Both the NS3-like polypeptide and the
NS4A-like peptide may be produced by any well-known method,
including synthetic methods, such as solid phase,
liquid phase and combination solid phase/liquid phase
syntheses; recombinant DNA methods, including cDNA
cloning, optionally combined with site directed
mutagenesis; and/or purification of the natural
products, optionally combined with enzymatic cleavage
methods to produce fragments of naturally occurring NS3
and NS4A.

According to a preferred embodiment, the
compositions of this invention are crystallizable. In
this preferred embodiment all of the preferred choices
for the NS3-like polypeptide and the NS4A-like peptide
are identical to those indicated above.

Advantageously, the crystallizable
compositions provided by this invention are amenable to
x-ray crystallography. Thus, this invention also
provides the three-dimensional structure of an HCV
NS3-like polypeptide/NS4A-like peptide complex,
specifically an HCV tNS3/sNS4A complex, at 2.5 Å
resolution. Importantly, this has provided, for the
first time, information about the shape and structure
of the NS3 protease domain.

The three-dimensional structure of the HCV
tNS3/sNS4A complex of this invention is defined by a
set of structure coordinates as set forth in Figure 3.
The term "structure coordinates" refers to Cartesian
coordinates derived from mathematical equations related
to the patterns obtained on diffraction of a
monochromatic beam of X-rays by the atoms (scattering
centers) of a tNS3/sNS4A complex in crystal form. The
diffraction data are used to calculate an electron
density map of the repeating unit of the crystal. The
electron density maps are then used to establish the
positions of the individual atoms of the tNS3/sNS4A
enzyme or enzyme complex.

Those of skill in the art will understand 
that a set of structure coordinates for an enzyme or an 
enzyme-complex or a portion thereof, is a relative set 
of points that define a shape in three dimensions. 
Thus, it is possible that an entirely different set of 
coordinates could define a similar or identical shape. 
Moreover, slight variations in the individual
coordinates will have little effect on overall shape. 

The variations in coordinates discussed above 
may be generated because of mathematical manipulations 
of the structure coordinates. For example, the 
structure coordinates set forth in Figure 3 could be 
manipulated by crystallographic permutations of the 
structure coordinates, fractionalization of the
structure coordinates, integer additions or 
subtractions to sets of the structure coordinates, 
inversion of the structure coordinates or any 
combination of the above. 
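
The point that such manipulations change the numbers but not the shape can be checked numerically. The sketch below is an illustration under stated assumptions: it uses made-up toy coordinates rather than Figure 3, covers only rotation, translation, and inversion, and uses the set of interatomic distances as a stand-in for "shape". All function names are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """All interatomic distances, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def rotate_z(coords, angle_deg):
    """Rotate every point about the z axis."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


def translate(coords, shift):
    """Add the same shift vector to every point."""
    dx, dy, dz = shift
    return [(x + dx, y + dy, z + dz) for x, y, z in coords]


def invert(coords):
    """Invert through the origin. Distances are preserved, although the
    handedness changes, which a distance comparison cannot detect."""
    return [(-x, -y, -z) for x, y, z in coords]


def same_shape(a, b, tol=1e-9):
    """True if both coordinate sets have identical interatomic distances."""
    da, db = pairwise_distances(a), pairwise_distances(b)
    return len(da) == len(db) and all(
        math.isclose(p, q, abs_tol=tol) for p, q in zip(da, db)
    )


if __name__ == "__main__":
    # Toy coordinates in angstroms (made up for illustration).
    original = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0), (0.3, 0.8, 2.0)]
    moved = invert(translate(rotate_z(original, 37.0), (10.0, -4.0, 2.5)))
    print(same_shape(original, moved))
```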

Alternatively, modifications in the crystal 
structure due to mutations, additions, substitutions, 
and/or deletions of amino acids, or other changes in 
any of the components that make up the crystal could 
also account for variations in structure coordinates. 
If such variations are within an acceptable standard 
error as compared to the original coordinates, the 
resulting three-dimensional shape is considered to be 
the same. 

Various computational analyses are therefore 
necessary to determine whether a molecule or molecular 
complex or a portion thereof is sufficiently similar to 
all or parts of the NS3-like polypeptide/NS4A-like 
peptide structure described above as to be considered 
the same. Such analyses may be carried out in current 
software applications, such as the Molecular Similarity 
application of QUANTA (Molecular Simulations Inc., San 
Diego, CA) version 4.1, and as described in the 
accompanying User's Guide. 

The Molecular Similarity application permits 
comparisons between different structures, different 
conformations of the same structure, and different
parts of the same structure. The procedure used in 
Molecular Similarity to compare structures is divided 
into four steps: 1) load the structures to be 
compared; 2) define the atom equivalences in these 
structures; 3) perform a fitting operation; and 4) 
analyze the results. 

Each structure is identified by a name. One 
structure is identified as the target (i.e., the fixed 
structure); all remaining structures are working 
structures (i.e., moving structures). Since atom 
equivalency within QUANTA is defined by user input, for 
the purpose of this invention we will define equivalent 
atoms as protein backbone atoms (N, Cα, C and O) for
all conserved residues between the two structures being 
compared. We will also consider only rigid fitting 
operations. 

When a rigid fitting method is used, the 
working structure is translated and rotated to obtain 
an optimum fit with the target structure. The fitting 
operation uses an algorithm that computes the optimum 
translation and rotation to be applied to the moving 
structure, such that the root mean square difference of 
the fit over the specified pairs of equivalent atoms is
an absolute minimum. This number, given in angstroms, 
is reported by QUANTA. 

For the purpose of this invention, any
molecule or molecular complex that has a root mean
square deviation of conserved residue backbone atoms
(N, Cα, C, O) of less than 1.5 Å when superimposed on
the relevant backbone atoms described by structure
coordinates listed in Figure 3 is considered
identical. More preferably, the root mean square
deviation is less than 1.0 Å.

The term "root mean square deviation" means 
the square root of the arithmetic mean of the squares 
of the deviations from the mean. It is a way to 
express the deviation or variation from a trend or 
object. For purposes of this invention, the "root mean 
square deviation" defines the variation in the backbone 
of a protein or protein complex from the relevant 
portion of the backbone of the NS3-like polypeptide 
portion of the complex as defined by the structure 
coordinates described herein. 
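
For structure comparison, this quantity is commonly written as shown below. This is a sketch under the assumption that each deviation is the distance between a superimposed backbone atom and its equivalent atom in the reference coordinates; the symbols are introduced here for illustration and are not used elsewhere in the specification.

```latex
% N      : number of equivalent backbone atoms (N, C-alpha, C, O) compared
% r_i    : position of atom i in the superimposed (working) structure
% r_i^ref: position of the equivalent atom in the reference coordinates
\mathrm{RMSD} = \sqrt{\frac{1}{N} \sum_{i=1}^{N}
    \left\lVert \mathbf{r}_i - \mathbf{r}_i^{\mathrm{ref}} \right\rVert^{2}}
```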

Once the structure coordinates of a protein
crystal have been determined they are useful in solving
the structures of other crystals.

Thus, in accordance with the present
invention, the structure coordinates of a NS3-like
polypeptide/NS4A-like peptide complex, and in
particular a tNS3/sNS4A complex, and portions thereof
are stored in a machine-readable storage medium. Such
data may be used for a variety of purposes, such as
drug discovery and x-ray crystallographic analysis of
protein crystals.

Accordingly, in one embodiment of this
invention is provided a machine-readable data storage
medium comprising a data storage material encoded with
the structure coordinates set forth in Figure 3.

Figure 4 demonstrates one version of these
embodiments. System 10 includes a computer 11
comprising a central processing unit ("CPU") 20, a
working memory 22 which may be, e.g., RAM (random-access
memory) or "core" memory, mass storage memory 24 (such
as one or more disk drives or CD-ROM drives), one or
more cathode-ray tube ("CRT") display terminals 26, one
or more keyboards 28, one or more input lines 30, and
one or more output lines 40, all of which are
interconnected by a conventional bidirectional system
bus 50.

Input hardware 36, coupled to computer 11 by
input lines 30, may be implemented in a variety of
ways. Machine-readable data of this invention may be
inputted via the use of a modem or modems 32 connected
by a telephone line or dedicated data line 34.
Alternatively or additionally, the input hardware 36
may comprise CD-ROM drives or disk drives 24. In
conjunction with display terminal 26, keyboard 28 may
also be used as an input device.

Output hardware 46, coupled to computer 11 by
output lines 40, may similarly be implemented by
conventional devices. By way of example, output
hardware 46 may include CRT display terminal 26 for
displaying a graphical representation of a binding
pocket of this invention using a program such as QUANTA
as described herein. Output hardware might also
include a printer 42, so that hard copy output may be
produced, or a disk drive 24, to store system output
for later use.

In operation, CPU 20 coordinates the use of
the various input and output devices 36, 46,
coordinates data accesses from mass storage 24 and
accesses to and from working memory 22, and determines
the sequence of data processing steps. A number of
programs may be used to process the machine-readable
data of this invention. Such programs are discussed in
reference to the computational methods of drug
discovery as described herein. Specific references to
components of the hardware system 10 are included as
appropriate throughout the following description of the
data storage medium.

Figure 5 shows a cross section of a magnetic
data storage medium 100 which can be encoded with
machine-readable data that can be carried out by a
system such as system 10 of Figure 4. Medium 100 can
be a conventional floppy diskette or hard disk, having
a suitable substrate 101, which may be conventional,
and a suitable coating 102, which may be conventional,
on one or both sides, containing magnetic domains (not
visible) whose polarity or orientation can be altered
magnetically. Medium 100 may also have an opening (not
shown) for receiving the spindle of a disk drive or
other data storage device 24.

The magnetic domains of coating 102 of medium
100 are polarized or oriented so as to encode, in a manner
which may be conventional, machine-readable data such
as that described herein, for execution by a system
such as system 10 of Figure 4.

Figure 6 shows a cross section of an
optically-readable data storage medium 110 which also
can be encoded with such machine-readable data, or
set of instructions, which can be carried out by a
system such as system 10 of Figure 4. Medium 110 can
be a conventional compact disk read only memory
(CD-ROM) or a rewritable medium such as a
magneto-optical disk which is optically readable and
magneto-optically writable. Medium 110 preferably has
a suitable substrate 111, which may be conventional,
and a suitable coating 112, which may be conventional,
usually on one side of substrate 111.

In the case of CD-ROM, as is well known,
coating 112 is reflective and is impressed with a 
plurality of pits 113 to encode the machine-readable 
data. The arrangement of pits is read by reflecting 
laser light off the surface of coating 112. A 
protective coating 114, which preferably is 
substantially transparent, is provided on top of 
coating 112. 

In the case of a magneto-optical disk, as is
well known, coating 112 has no pits 113, but has a
plurality of magnetic domains whose polarity or
orientation can be changed magnetically when heated
above a certain temperature, as by a laser (not shown).
The orientation of the domains can be read by measuring
the polarization of laser light reflected from coating
112. The arrangement of the domains encodes the data
as described above.

For the first time, the present invention 
permits the use of structure-based or rational drug 
design techniques to design, select, and synthesize 
chemical entities, including inhibitory compounds that 
are capable of binding to HCV NS3, NS4A, NS3/NS4A
complex, or any portion thereof. 

One particularly useful drug design technique 
enabled by this invention is iterative drug design. 
Iterative drug design is a method for optimizing 
associations between a protein and a compound by 
determining and evaluating the three-dimensional 
structures of successive sets of protein/compound 
complexes.

Those of skill in the art will realize that 
association of natural ligands or substrates with the 
binding pockets of their corresponding receptors or 
enzymes is the basis of many biological mechanisms of
action. The term "binding pocket", as used herein, 
refers to a region of a molecule or molecular complex, 
that, as a result of its shape, favorably associates 
with another chemical entity or compound. Similarly, 
many drugs exert their biological effects through 
association with the binding pockets of receptors and 
enzymes. Such associations may occur with all or any 
parts of the binding pockets. An understanding of such 
associations will help lead to the design of drugs 
having more favorable associations with their target 
receptor or enzyme, and thus, improved biological 
effects. Therefore, this information is valuable in 
designing potential ligands or inhibitors of receptors 
or enzymes, such as inhibitors of HCV NS3-like
polypeptides, and more importantly HCV NS3. 

The term "associating with" refers to a 
condition of proximity between chemical entities or 
compounds, or portions thereof. The association may be 
non-covalent — wherein the juxtaposition is 
energetically favored by hydrogen bonding or van der 
Waals or electrostatic interactions — or it may be 
covalent.

In iterative drug design, crystals of a
series of protein/compound complexes are obtained and
then the three-dimensional structure of each complex
is solved. Such an approach provides insight into the
association between the proteins and compounds of each
complex. This is accomplished by selecting compounds
with inhibitory activity, obtaining crystals of this
new protein/compound complex, solving the
three-dimensional structure of the complex, and comparing the
associations between the new protein/compound complex
and previously solved protein/compound complexes. By
observing how changes in the compound affected the
protein/compound associations, these associations may
be optimized.

In some cases, iterative drug design is
carried out by forming successive protein-compound
complexes and then crystallizing each new complex.
Alternatively, a pre-formed protein crystal is soaked
in the presence of an inhibitor, thereby forming a
protein/compound complex and obviating the need to
crystallize each individual protein/compound complex.
Advantageously, the HCV NS3-like polypeptide/NS4A-like
peptide crystals, and in particular the tNS3/sNS4A
crystals, provided by this invention may be soaked in
the presence of a compound or compounds, such as NS3
protease inhibitors, to provide NS3-like
polypeptide/NS4A-like peptide/compound crystal
complexes.

As used herein, the term "soaked" refers to a 
process in which the crystal is transferred to a 
solution containing the compound of interest. 

In another embodiment of this invention is 
provided a method for preparing a composition 
comprising a NS3-like polypeptide protein comprising 
the steps described in Examples 1 and 2. Preferably,
the composition comprises a NS3-like polypeptide in 
complex with a NS4A-like peptide. 

The structure coordinates set forth in Figure 
3 can also be used to aid in obtaining structural 
information about another crystallized molecule or 
molecular complex. This may be achieved by any of a 
number of well-known techniques, including molecular 
replacement . 
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The structure coordinates set forth in Figure 
3 can also be used for determining at least a portion 
of the three-dimensional structure of molecules or 
molecular complexes which contain at least some 
5 structurally similar features to HCV NS3. In 

particular, structural information about another 
crystallized molecule or molecular complex may be 
obtained. This may be achieved by any of a number of 
well-known techniques, including molecular replacement. 
10 Therefore, in another embodiment this 

invention provides a method of utilizing molecular 
replacement to obtain structural information about a 
crystallized molecule or molecular complex whose 
structure is unknown comprising the steps of: 

a) generating an X-ray diffraction pattern
from said crystallized molecule or molecular complex;
and

b) applying at least a portion of the
structure coordinates set forth in Figure 3 to the
X-ray diffraction pattern to generate a
three-dimensional electron density map of the molecule
or molecular complex whose structure is unknown.

Preferably, the crystallized molecule or
molecular complex comprises a NS3-like polypeptide and
a NS4A-like peptide. More preferably, the crystallized
molecule or molecular complex is obtained by soaking a
crystal of this invention in a solution.

By using molecular replacement, all or part
of the structure coordinates of the tNS3/sNS4A complex
provided by this invention (and set forth in Figure 3)
can be used to determine the structure of a
crystallized molecule or molecular complex whose
structure is unknown more quickly and efficiently than
attempting to determine such information ab initio.



WO 98/11134 



PCT/US97/16182 



Molecular replacement provides an accurate
estimation of the phases for an unknown structure.
Phases are a factor in the equations used to solve
crystal structures, and they cannot be determined
directly. Obtaining accurate values for the phases by
methods other than molecular replacement is a
time-consuming process that involves iterative cycles
of approximations and refinements and greatly hinders
the solution of crystal structures. However, when the
crystal structure of a protein containing at least a
homologous portion has been solved, the phases from the
known structure provide a satisfactory estimate of the
phases for the unknown structure.

Thus, this method involves generating a
preliminary model of a molecule or molecular complex
whose structure coordinates are unknown, by orienting
and positioning the relevant portion of the tNS3/sNS4A
complex according to Figure 3 within the unit cell of
the crystal of the unknown molecule or molecular
complex so as best to account for the observed X-ray
diffraction pattern of the crystal of the molecule or
molecular complex whose structure is unknown. Phases
can then be calculated from this model and combined
with the observed X-ray diffraction pattern amplitudes
to generate an electron density map of the structure
whose coordinates are unknown. This, in turn, can be
subjected to any well-known model building and
structure refinement techniques to provide a final,
accurate structure of the unknown crystallized molecule
or molecular complex [E. Lattman, "Use of the Rotation
and Translation Functions", in Meth. Enzymol., 115, pp.
55-77 (1985); M. G. Rossmann, ed., "The Molecular
Replacement Method", Int. Sci. Rev. Ser., No. 13,
Gordon & Breach, New York (1972)].
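
The phase-combination step described above may be
illustrated by a minimal one-dimensional sketch. It is
a toy and not the procedure used in the Examples: the
density values, the function names (dft, inverse_dft,
phased_map) and the use of a plain discrete Fourier
transform are all assumptions made for illustration.
Phases are calculated from a placed model, attached to
the observed amplitudes, and back-transformed into a
density map.

```python
import cmath
import math


def dft(values):
    """Discrete Fourier transform of a 1-D density (toy structure factors)."""
    n = len(values)
    return [
        sum(values[x] * cmath.exp(-2j * math.pi * h * x / n) for x in range(n))
        for h in range(n)
    ]


def inverse_dft(coeffs):
    """Inverse transform; returns the real part as the density map."""
    n = len(coeffs)
    return [
        sum(coeffs[h] * cmath.exp(2j * math.pi * h * x / n) for h in range(n)).real / n
        for x in range(n)
    ]


def phased_map(observed_amplitudes, model_density):
    """Attach phases calculated from the model to the observed amplitudes."""
    f_calc = dft(model_density)
    combined = [
        amp * cmath.exp(1j * cmath.phase(fc))
        for amp, fc in zip(observed_amplitudes, f_calc)
    ]
    return inverse_dft(combined)


# Hypothetical "unknown" density; an experiment records only the amplitudes.
true_density = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 2.0, 0.0]
observed = [abs(f) for f in dft(true_density)]

# If the placed model matches the unknown, its phases reproduce the density.
recovered = phased_map(observed, true_density)
```

In practice the search model only approximates the
unknown structure, so the resulting map is a starting
point for model building and refinement rather than a
final answer.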

The structure of any portion of any
crystallized molecule or molecular complex that is
sufficiently homologous to any portion of the
tNS3/sNS4A complex can be solved by this method.

In a preferred embodiment, the method of
molecular replacement is utilized to obtain structural
information about a molecule or molecular complex,
wherein the complex comprises a NS3-like polypeptide.
Preferably the NS3-like polypeptide is tNS3 or
homologue thereof.

The structure coordinates of tNS3/sNS4A as
provided by this invention are particularly useful in
solving the structure of other crystal forms of
NS3-like polypeptide, preferably other crystal forms of
tNS3; NS3-like polypeptide/NS4A-like peptide,
preferably tNS3/sNS4A; or complexes comprising any of
the above.

The structure coordinates are also 
particularly useful to solve the structure of crystals 
of NS3-like polypeptide/NS4A-like peptide complexes, 
particularly tNS3/sNS4A, co-complexed with a variety of 
chemical entities. This approach enables the 
determination of the optimal sites for interaction 
between chemical entities, including interaction of 
candidate NS3 inhibitors with NS3 or the NS3/NS4A
complex. For example, high resolution X-ray 
diffraction data collected from crystals exposed to 
different types of solvent allows the determination of 
where each type of solvent molecule resides. Small 
molecules that bind tightly to those sites can then be 
designed and synthesized and tested for their NS3 
inhibition activity. 

All of the complexes referred to above may be
studied using well-known X-ray diffraction techniques
and may be refined versus 1.5-3 Å resolution X-ray data
to an R value of about 0.20 or less using computer
software, such as X-PLOR [Yale University, ©1992,
distributed by Molecular Simulations, Inc.; see, e.g.,
Blundell & Johnson, supra; Meth. Enzymol., vol. 114 &
115, H. W. Wyckoff et al., eds., Academic Press
(1985)]. This information may thus be used to optimize
known NS3 inhibitors, and more importantly, to design 
new NS3 inhibitors. 

In order that this invention be more fully 
understood, the following examples are set forth. 
These examples are for illustrative purposes only
and are not to be construed as limiting the scope of 
this invention in any way. 

EXAMPLE 1 
Expression and Purification of tNS3 
The truncated NS3 serine protease domain
(tNS3) was cloned from a cDNA of the hepatitis C virus
H strain [Grakoui, A. et al., "Expression and
Identification of Hepatitis C Virus Polyprotein
Cleavage Products", J. Virol., 67, pp. 1385-1395
(1993)]. The first 181 amino acids of NS3 (residues
1027-1207 of the viral polyprotein) have been shown to
contain the serine protease domain of NS3 that
processes all four downstream sites of the HCV
polyprotein [Lin, C. et al., "Hepatitis C Virus NS3
Serine Proteinase: Trans-Cleavage Requirements and
Processing Kinetics", J. Virol., 68, pp. 8147-8157
(1994b)], so we expressed a (His)6-fusion protein based
on this tNS3. The plasmid pET-BS(+)/HCV/T7-NS3_181-His
was derived from pTM3/HCV/1027-1207 (NS3_181) (Id.) by
using polymerase chain reaction to introduce epitope
tags and new restriction sites. A T7-tag (ASMTGGQQMG),
from the N-terminus of the gene 10 protein of the T7
bacteriophage [Tsai, D.E. et al., "In Vitro Selection
of an RNA Epitope Immunologically Cross-Reactive With a
Peptide", Proc. Natl. Acad. Sci. USA, 89, pp. 8864-8868
(1992)], was placed at the N-terminus of the tNS3
domain. Two linker residues (GS) were placed at the
tNS3 C-terminus, followed by the (His)6-tag. E. coli
JM109(DE3) cells, freshly transformed with the
pET-BS(+)/HCV/T7-NS3_181-His plasmid, were grown at
37 °C in complex media supplemented with 100 µg/ml
ampicillin, in a 10 L fermentor (Braun). When the cell
density reached an OD600 of 3-4 the temperature of the
culture was rapidly reduced to 30 °C, and induction was
immediately initiated by the addition of 1 mM IPTG.
Cells were harvested at 2 h post-induction, and flash
frozen at -70 °C prior to purification.

The tNS3 was purified from the soluble
fraction of the recombinant E. coli lysates as follows,
with all procedures being performed at 4 °C unless
stated otherwise. Cell paste (75-100 g) was resuspended
in 15 volumes of 50 mM HEPES, 0.3 M NaCl, 10% glycerol,
0.1% β-octyl glucoside, 2 mM β-mercaptoethanol, pH 8.0.
Cells were ruptured using a microfluidizer and the
homogenate was clarified by centrifugation at 100,000 x
g for 30 min. The supernatant was brought to 50 mM
HEPES, 20 mM imidazole, 0.3 M NaCl, 27.5% glycerol,
0.1% β-octyl glucoside, 2 mM β-mercaptoethanol, pH
8.0, and applied at 1.0 ml/min to a 7.0 ml Ni-Agarose
affinity column, equilibrated in the same buffer.
After loading, the column was washed with 10-15 volumes
of equilibration buffer and the bound proteins were
eluted with equilibration buffer containing 0.35 M
imidazole. The protein was then size-fractionated on
two columns in series (each 2.6 cm x 90 cm) packed with
Pharmacia high resolution S100 resin and equilibrated
with 25 mM HEPES, 0.3 M NaCl, 10% glycerol, 0.1%
β-octyl glucoside, 2 mM β-mercaptoethanol, pH 8.0. The
tNS3 fractions, identified by SDS-PAGE, were pooled and
concentrated to 1 mg/ml using an Amicon Centriprep-10,
and stored at -70 °C. The tNS3 was thawed slowly on ice
and the NS4A peptide (dissolved in the size-exclusion
chromatography buffer) was added at a tNS3:NS4A-peptide
molar ratio of 1:2. The sample was then diluted
2.5-fold with 15 mM MES, 0.5 M NaCl, 20 mM
β-mercaptoethanol, pH 6.5, and concentrated to ~2 ml
(~2 mg/ml) by ultrafiltration. The sample was then
diluted 2-fold with the pH 6.5 buffer and concentrated
again to ~2 ml. This dilution process was repeated
until it gave a >40-fold dilution of the original
buffer constituents. The protein sample was then
concentrated to 13.0 mg/ml and centrifuged at
~300,000 x g for 20 min at 4 °C. Concentrations of the
pure tNS3 and tNS3/4A complex were determined by UV
absorption spectroscopy, using a molar absorption
coefficient (A280) of 17,700 M^-1 cm^-1.
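
The two small calculations implied by this procedure,
the Beer-Lambert conversion of an A280 reading to a
molar concentration and the number of dilution cycles
needed to exceed a 40-fold buffer exchange, may be
sketched as follows. This is an idealized illustration:
the function names, the 1 cm path length and the example
absorbance are assumptions, and each cycle is assumed to
return the sample to its starting volume.

```python
EPSILON_280 = 17700.0  # M^-1 cm^-1, the coefficient quoted in the text


def molar_concentration(a280, path_cm=1.0, epsilon=EPSILON_280):
    """Beer-Lambert law: c = A / (epsilon * path length)."""
    return a280 / (epsilon * path_cm)


def cycles_needed(first_dilution=2.5, step=2.0, target=40.0):
    """Count the repeat dilutions needed for the cumulative factor to exceed target.

    The first dilution (2.5-fold) is applied once; each later cycle multiplies
    the cumulative dilution of the original buffer by `step`.
    """
    total = first_dilution
    cycles = 0
    while total <= target:
        total *= step
        cycles += 1
    return cycles, total


# Hypothetical reading: an absorbance of 0.885 in a 1 cm cell.
example_molarity = molar_concentration(0.885)
repeat_cycles, final_dilution = cycles_needed()
```

Under these assumptions five 2-fold cycles after the
initial 2.5-fold dilution give an 80-fold cumulative
dilution, the first value to exceed 40-fold.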

EXAMPLE 2

4A Peptide Synthesis and Purification

The HCV NS4A peptide was synthesized to span
residues Gly21 to Pro39 of the viral cofactor (residues
1678 to 1696 of the HCV polyprotein), which
incorporates the region reported to be
essential for NS3 stimulation [Lin, C. et al., "A
Central Region in the Hepatitis C Virus NS4A Protein
Allows Formation of an Active NS3-NS4A Serine
Proteinase Complex In Vivo and In Vitro", J. Virol., 69,
pp. 4373-4380 (1995)]. Lysine residues were added to
the termini to assist aqueous solubility, and a serine
residue was substituted for Cys22 (residue 1679 of the
polyprotein of the HCV H strain). The peptide
(H-KKGSVVIVGRIVLSGKPAIIPKK-OH • TFA salt) was prepared
by solid-phase peptide synthesis (Applied Biosystems
433A) beginning with Nα-Fmoc, Nε-Boc-Lys Wang resin.
Nα-Fmoc-protected amino acids were added sequentially
using HBTU
(2-(1H-benzotriazol-1-yl)-1,1,3,3-tetramethyluronium
hexafluorophosphate) with HOBt (1-hydroxybenzotriazole
hydrate) as coupling agents in N-methylpyrrolidinone.
Cleavage from the resin and
global deprotection were accomplished with 95%
trifluoroacetic acid and 5% water at room temperature
for 1.5 hr (15 ml/g resin). The peptide was purified
by preparative HPLC on a Waters Delta Pak C18, 15 µm,
300 Å column (30 mm x 300 mm) eluting with a linear
gradient of acetonitrile (15-40%) in 0.1% aqueous
trifluoroacetic acid over 35 min (flow rate of 22
ml/min). Peptide purity was confirmed by analytical
HPLC. The sequence was confirmed by direct N-terminal
sequence analysis and matrix-assisted laser desorption
mass spectrometry (Kratos MALDI I), which showed the
correct (M + H)+ and (M + Na)+ molecular ions.
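
The correspondence between the peptide sequence, the
NS4A numbering (Gly21 to Pro39) and the polyprotein
numbering (1678 to 1696) may be checked with a short
sketch. It assumes that the 19-residue core reads
GSVVIVGRIVLSGKPAIIP, flanked by two lysines at each
terminus; the function and constant names are
hypothetical.

```python
PEPTIDE = "KKGSVVIVGRIVLSGKPAIIPKK"  # assumed reading of the synthetic peptide
FIRST_NS4A_RESIDUE = 21              # Gly21 of NS4A
FIRST_POLYPROTEIN_RESIDUE = 1678     # same residue in polyprotein numbering


def number_core(peptide, n_flank=2):
    """Return (residue, NS4A number, polyprotein number) for the core span.

    The `n_flank` solubilizing lysines at each terminus are excluded.
    """
    core = peptide[n_flank:len(peptide) - n_flank]
    return [
        (aa, FIRST_NS4A_RESIDUE + i, FIRST_POLYPROTEIN_RESIDUE + i)
        for i, aa in enumerate(core)
    ]


numbering = number_core(PEPTIDE)
by_polyprotein = {poly: aa for aa, _, poly in numbering}
```

With this reading the serine substitution falls at
position 22 (1679) and residues 1685 to 1696 match the
residue types listed for the peptide in Figure 3
(Arg 1685, Ile 1686, Val 1687, and so on to Pro 1696).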

EXAMPLE 3

Crystallization and Data Collection
Crystals of the tNS3/NS4A complex were grown
by hanging-drop vapor diffusion over a reservoir of 0.1
M MES, 1.8 M NaCl, 0.1 M sodium/potassium phosphate, 10
mM β-mercaptoethanol, pH 6.5. The crystals grew over
the course of 2-3 weeks, to final dimensions of about
0.1 x 0.1 x 0.25 mm. The rhombohedral crystals used in
this study belonged to space group R32, with unit cell
dimensions a=b=225.0 Å, and c=75.5 Å, and contained two
tNS3/NS4A complexes per asymmetric unit.
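
The unit cell volume implied by these dimensions may be
estimated with the following sketch. It assumes the
hexagonal setting of R32 (a = b, gamma = 120 degrees),
in which the cell contains 18 asymmetric units; the
function and constant names are hypothetical.

```python
import math


def hexagonal_cell_volume(a, c):
    """Volume of a hexagonal-setting cell: V = a^2 * c * sin(120 degrees)."""
    return a * a * c * math.sin(math.radians(120.0))


A_AXIS = 225.0          # angstroms, from the text
C_AXIS = 75.5           # angstroms, from the text
ASU_PER_CELL = 18       # general positions of R32 in the hexagonal setting
COMPLEXES_PER_ASU = 2   # from the text

volume = hexagonal_cell_volume(A_AXIS, C_AXIS)          # cubic angstroms
complexes_per_cell = ASU_PER_CELL * COMPLEXES_PER_ASU
volume_per_complex = volume / complexes_per_cell
```

Under these assumptions the cell volume is about
3.31 x 10^6 cubic angstroms, shared among 36 complexes.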

Statistics for data collection, heavy atom
refinement, and crystallographic refinement are given
in Table 1. All heavy atom soaks were done in
hanging-drops over the same reservoir as used for
crystallization. Crystals were transferred to a
stabilizing solution (50 mM MES, 2.0 M NaCl, 0.1 M
sodium/potassium phosphate, 10 mM β-mercaptoethanol,
and 20% glycerol, pH 6.2) and then frozen in a dry
nitrogen gas stream at 100 K (Molecular Structure
Corp., Houston, TX) for data collection. Data were
acquired by oscillation photography on a Rigaku R-AXIS
IIC phosphor imaging area detector mounted on a Rigaku
RU200 rotating anode generator (MSC), operating at
50 kV and 100 mA. Measured intensities were integrated,
scaled, and merged using the HKL software package (Z.
Otwinowski and W. Minor).



EXAMPLE 4 

Phasing, Model Building and Refinement 
Heavy atom positions were located by
inspection and confirmed with difference Fourier
syntheses. Heavy atom parameters were refined and
phases computed to 3.1 Å using the program PHASES
[Furey, W. and Swaminathan, S., "PHASES-95: a program
package for the processing and analysis of diffraction
data from macromolecules", Meth. Enzymol. (1996)]. MIR
phases were improved and extended to 2.7 Å by cycles of
solvent flattening [Wang, B.C., "Resolution of Phase
Ambiguity in Macromolecular Crystallography", Methods
in Enzymol., 115, pp. 90-112 (1985)] combined with
histogram matching [Zhang, K.Y.J. and Main, P., "The
Use of Sayre's Equation With Solvent Flattening and
Histogram Matching for Phase Extension and Refinement
of Protein Structures", Acta Crystallogr., A46, pp.
377-381 (1990)] using the CCP4 crystallographic package
(Collaborative Computational Project, 1994). The
resulting electron density map displayed nearly
continuous density for the protein backbone as well as
strong side chain density. Approximately 80% of the
model could be unambiguously built into this map
(QUANTA 4.1, Molecular Simulations), and a single round
of simulated annealing refinement in X-PLOR [Brunger,
A. T., "X-PLOR: A System for X-Ray Crystallography and
NMR", New Haven, Connecticut: Department of Molecular
Biophysics and Biochemistry, Yale University (1993)]
brought the R-factor to 29% and free R value to 33%
[Brunger, A. T., "Free R Value: A Novel Statistical
Quantity for Assessing the Accuracy of Crystal
Structures", Nature, 355, pp. 472-475 (1992)]. The
remainder of the model was built and refined in several
steps, by first extending the resolution to 2.5 Å and
then adding well-ordered water molecules. A final
round of positional and individual temperature factor
refinement brought the R-factor to 21.6% (free R value
26.1%) for 26,652 reflections between 6.0 and 2.5 Å
(F > 1σF). The current model consisted of tNS3 residues
1055-1206 and NS4A residues 1678-1693 in complex A, and
tNS3 residues 1028-1206 and NS4A residues 1678-1696 for
complex B (polyprotein numbering), with 2 zinc atoms
and 130 water molecules. A Ramachandran plot for the
final model contained 91% of the residues in the most
favored regions and 0% in disallowed or
generously-allowed regions. The rms deviations from
ideality were 0.007 Å for bond lengths and 1.47° for
bond angles.
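
The R-factor quoted above is conventionally the summed
absolute difference between observed and calculated
structure factor amplitudes divided by the summed
observed amplitudes. A minimal sketch follows; it
assumes the amplitudes are already on a common scale,
and the function name and example values are
hypothetical and are not taken from Table 1. The free R
value uses the same formula over reflections excluded
from refinement.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), with no extra scaling."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical amplitudes for four reflections.
observed = [100.0, 50.0, 25.0, 10.0]
calculated = [90.0, 55.0, 20.0, 12.0]

example_r = r_factor(observed, calculated)
perfect_r = r_factor(observed, observed)
```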

While we have described a number of 
embodiments of this invention, it is apparent that our 
basic examples may be altered to provide other 
embodiments which utilize the products and processes of 
this invention. Therefore, it will be appreciated that 
the scope of this invention is to be defined by the 
appended claims rather than by the specific embodiments 
which have been represented by way of example. 



CLAIMS 

We claim: 

1. A composition comprising a HCV NS3-like 
polypeptide complexed with a NS4A-like peptide.

2. The composition according to claim 1, 
wherein the HCV NS3-like polypeptide is a NS3 protease 
domain polypeptide or a NS3 protease domain-like
polypeptide.

3. The composition according to claim 1, 
wherein the HCV NS3-like polypeptide is tNS3. 

4. The composition according to any one of
claims 1 to 3, wherein the NS4A-like peptide is
H-KKGSVVIVGRIVLSGKPAIIPKK-OH.

5. A crystal comprising a HCV NS3-like 
polypeptide complexed with a NS4A-like peptide.

6. The crystal according to claim 5, wherein 
the HCV NS3-like polypeptide is a NS3 protease domain 
polypeptide or a NS3 protease domain-like polypeptide. 



7. The crystal according to claim 5, wherein 
the HCV NS3-like polypeptide is tNS3. 

8. The crystal according to any one of
claims 1 to 3, wherein the NS4A-like peptide is
H-KKGSVVIVGRIVLSGKPAIIPKK-OH.

9. The crystal according to claim 5,
additionally comprising an inhibitor of HCV NS3.

10. A machine-readable data storage medium,
comprising a data storage material encoded with machine
readable data, wherein the data is defined by the
structure coordinates of a tNS3/sNS4A complex according
to Figure 3, or a homologue of said complex, wherein
said homologue comprises backbone atoms that have a
root mean square deviation from the backbone atoms of
the complex of not more than 1.5 Å.

11. The machine-readable data storage
medium, according to claim 10, wherein said molecule or
molecular complex is defined by the set of structure
coordinates for tNS3/sNS4A according to Figure 3, or a
homologue of said molecule or molecular complex, said
homologue having a root mean square deviation from the
backbone atoms of said amino acids of not more than
1.5 Å.

12. A machine-readable data storage medium
comprising a data storage material encoded with a first
set of machine readable data comprising a Fourier
transform of at least a portion of the structural
coordinates for tNS3/sNS4A according to Figure 3;
which, when combined with a second set of machine
readable data comprising an X-ray diffraction pattern
of a molecule or molecular complex of unknown
structure, using a machine programmed with instructions
for using said first set of data and said second set of
data, can determine at least a portion of the structure
coordinates corresponding to the second set of machine
readable data, said first set of data and said second
set of data.

13. A method of obtaining structural
information about a molecule or a molecular complex of
unknown structure by using the structure coordinates
set forth in Figure 3, comprising the steps of:

a. generating X-ray diffraction data from
said crystallized molecule or molecular complex;

b. applying at least a portion of the
structure coordinates set forth in Figure 3 to said
X-ray diffraction pattern to generate a
three-dimensional electron density map of at least a
portion of the molecule or molecular complex.

14. The method according to claim 13, 
wherein the molecule or molecular complex of unknown 
structure comprises a polypeptide selected from a NS3- 
like polypeptide in complex with a NS4A-like peptide. 

FIGURE 3
tNS3 COORDINATES (Complex A)
50.591 1.00 25.76
50.566 1.00 26.31
ATOM 2425 C ARG 1206 80.030 74.780 24.404 1.00 31.71
ATOM 2426 O ARG 1206 80.163 75.405 25.487 1.00 33.75
ATOM 2593 CZ ARG 1685 95.485 51.487 40.048 1.00 27.79
ATOM 2594 NH1 ARG 1685 95.815 52.632 39.474 1.00 27.86
ATOM 2595 NH2 ARG 1685 96.404 50.557 40.224 1.00 29.00
ATOM 2596 C ARG 1685 91.113 54.636 43.185 1.00 20.01
ATOM 2597 O ARG 1685 91.404 54.035 44.218 1.00 22.88
ATOM 2598 N ILE 1686 89.882 54.703 42.697 1.00 18.51
ATOM 2599 CA ILE 1686 88.745 54.066 43.346 1.00 17.38
ATOM 2600 CB ILE 1686 87.823 55.120 43.985 1.00 15.63
ATOM 2601 CG2 ILE 1686 86.464 54.533 44.292 1.00 14.38
ATOM 2602 CG1 ILE 1686 88.496 55.665 45.251 1.00 14.48
ATOM 2603 CD1 ILE 1686 87.740 56.758 45.935 1.00 14.17
ATOM 2604 C ILE 1686 88.043 53.247 42.268 1.00 17.71
ATOM 2605 O ILE 1686 87.699 53.769 41.205 1.00 17.73
ATOM 2606 N VAL 1687 87.945 51.940 42.484 1.00 16.52
ATOM 2607 CA VAL 1687 87.325 51.073 41.493 1.00 14.59
ATOM 2608 CB VAL 1687 88.152 49.810 41.237 1.00 13.39
ATOM 2609 CG1 VAL 1687 87.509 48.980 40.145 1.00 11.21
ATOM 2610 CG2 VAL 1687 89.564 50.184 40.842 1.00 11.10
ATOM 2611 C VAL 1687 85.922 50.681 41.874 1.00 15.40
ATOM 2612 O VAL 1687 85.695 50.067 42.916 1.00 17.30
ATOM 2613 N LEU 1688 84.990 51.081 41.016 1.00 16.16
ATOM 2614 CA LEU 1688 83.566 50.827 41.171 1.00 15.94
ATOM 2615 CB LEU 1688 82.792 52.003 40.578 1.00 14.08
ATOM 2616 CG LEU 1688 82.936 53.300 41.365 1.00 15.70
ATOM 2617 CD1 LEU 1688 82.611 54.511 40.519 1.00 15.92
ATOM 2618 CD2 LEU 1688 82.026 53.212 42.559 1.00 18.17
ATOM 2619 C LEU 1688 83.129 49.532 40.481 1.00 17.97
ATOM 2620 O LEU 1688 82.209 48.856 40.939 1.00 20.09
ATOM 2621 N SER 1689 83.826 49.174 39.406 1.00 20.02
ATOM 2622 CA SER 1689 83.518 47.982 38.611 1.00 21.75
ATOM 2623 CB SER 1689 84.253 48.060 37.266 1.00 23.96
ATOM 2624 OG SER 1689 85.605 48.467 37.436 1.00 27.29
ATOM 2625 C SER 1689 83.749 46.603 39.243 1.00 21.43
ATOM 2626 O SER 1689 83.290 45.596 38.700 1.00 23.26
ATOM 2627 N GLY 1690 84.465 46.547 40.364 1.00 21.24
ATOM 2628 CA GLY 1690 724 45.277 41.015 1.00 19.68
ATOM 2629 C GLY 1690 83.431 44.657 41.500 1.00 21.01
ATOM 2630 O GLY 1690 82.452 45.367 41.741 1.00 20.02
ATOM 2631 N LYS 1691 83.419 43.334 41.618 1.00 22.40
ATOM 2632 CA LYS 1691 82.245 42.596 42.075 1.00 25.78
ATOM 2633 CB LYS 1691 81.621 41.838 40.901 1.00 30.22
ATOM 2634 CG LYS 1691 81.068 42.740 39.813 1.00 38.96
ATOM 2635 CD LYS 1691 80.651 41.947 38.575 1.00 46.14
ATOM 2636 CE LYS 1691 80.270 42.880 37.405 1.00 49.42
ATOM 2637 NZ LYS 1691 79.814 42.127 36.186 1.00 50.87
ATOM 2638 C LYS 1691 82.705 41.610 43.142 1.00 25.02
ATOM 2639 O LYS 1691 83.885 41.254 43.176 1.00 26.26
ATOM 2640 N PRO 1692 81.796 41.167 44.031 1.00 23.86
ATOM 2641 CD PRO 1692 80.365 41.516 44.115 1.00 25.36
ATOM 2642 CA PRO 1692 82.157 40.217 45.088 1.00 23.51
ATOM 2643 CB PRO 1692 80.801 39.691 45.542 1.00 23.49
ATOM 2644 CG PRO 1692 79.954 40.911 45.451 1.00 24.91
ATOM 2645 C PRO 1692 83.027 39.095 44.548 1.00 23.24
ATOM 2646 O PRO 1692 82.724 38.506 43.515 1.00 25.88
ATOM 2647 N ALA 1693 84.136 38.831 45.219 1.00 21.89
ATOM 2648 CA ALA 1693 85.035 37.787 44.780 1.00 21.26
ATOM 2649 CB ALA 1693 86.173 38.383 43.989 1.00 21.74
ATOM 2650 C ALA 1693 85.568 37.089 46.000 1.00 22.42
ATOM 2651 O ALA 1693 85.708 37.705 47.055 1.00 24.98
ATOM 2652 N ILE 1694 85.810 35.791 45.879 1.00 22.49
ATOM 2653 CA ILE 1694 86.342 35.020 46.989 1.00 21.65
ATOM 2654 CB ILE 1694 86.052 33.524 46.794 1.00 20.10
ATOM 2655 CG2 ILE 1694 86.718 32.714 47.873 1.00 22.22
ATOM 2656 CG1 ILE 1694 84.539 33.293 46.829 1.00 20.16
ATOM 2657 CD1 ILE 1694 84.133 31.872 46.593 1.00 19.23
ATOM 2658 C ILE 1694 87.832 35.311 46.999 1.00 22.06
ATOM 2659 O ILE 1694 88.506 35.118 45.988 1.00 25.81
ATOM 2660 N ILE 1695 88.336 35.837 48.109 1.00 21.38
ATOM 2661 CA ILE 1695 89.748 36.181 48.194 1.00 19.73
ATOM 2662 CB ILE 1695 90.112 36.812 49.557 1.00 16.79
ATOM 2663 CG2 ILE 1695 91.550 37.296 49.533 1.00 16.53
ATOM 2664 CG1 ILE 1695 89.209 38.009 49.853 1.00 11.03
ATOM 2665 CD1 ILE 1695 89.480 38.658 51.192 1.00 8.12
ATOM 2666 C ILE 1695 90.596 34.943 47.947 1.00 22.82
ATOM 2667 O ILE 1695 90.498 33.954 48.669 1.00 22.97
ATOM 2668 N PRO 1696 91.407 34.967 46.886 1.00 26.75
ATOM 2669 CD PRO 1696 91.564 36.085 45.940 1.00 28.08
ATOM 2670 CA PRO 1696 92.279 33.851 46.520 1.00 30.50
ATOM 2671 CB PRO 1696 93.080 34.418 45.354 1.00 29.24
ATOM 2672 CG PRO 1696 92.138 35.395 44.733 1.00 28.73
ATOM 2673 C PRO 1696 93.207 33.490 47.662 1.00 36.92
ATOM 2674 O PRO 1696 93.659 34.371 48.402 1.00 37.43
ATOM 2675 N LYS 1697 93.454 32.191 47.824 1.00 42.85
ATOM 2676 CA LYS 1697 94.356 31.692 48.863 1.00 46.97
ATOM 2677 CB LYS 1697 93.938 30.284 49.314 1.00 48.62
ATOM 2678 CG LYS 1697 92.480 30.143 49.712 1.00 52.72
ATOM 2679 CD LYS 1697 92.325 29.631 51.139 1.00 54.54
ATOM 2680 CE LYS 1697 90.885 29.167 51.394 1.00 56.49
ATOM 2681 NZ LYS 1697 90.578 27.847 50.749 1.00 59.20
ATOM 2682 C LYS 1697 95.789 31.645 48.311 1.00 48.94
ATOM 2683 O LYS 1697 96.588 32.540 48.661 1.00 50.95



ZINC ION COORDINATES 



Atom Type Resid # X Y Z OCC B
ATOM 2684 ZN ZN 901 71.089 51.399 51.975 1.00 29.52
ATOM 2685 ZN ZN 902 70.157 56.302 36.264 1.00 32.22



WATER MOLECULE COORDINATES

Atom Type Resid # X Y Z OCC B
ATOM 2686 OH2 TIP3 1 80.188 51.569 51.895 1.00 14.91
ATOM 2687 OH2 TIP3 2 84.436 50.278 50.706 1.00 27.51
ATOM 2688 OH2 TIP3 3 80.360 48.626 54.069 1.00 26.47
ATOM 2689 OH2 TIP3 4 90.933 33.696 52.289 1.00 23.43
ATOM 2690 OH2 TIP3 5 72.460 27.233 53.507 1.00 26.41
ATOM 2691 OH2 TIP3 6 87.098 37.167 56.456 1.00 35.13
ATOM 2692 OH2 TIP3 7 80.196 47.990 42.389 1.00 32.46
ATOM 2693 OH2 TIP3 8 80.634 37.946 41.797 1.00 15.54
ATOM 2694 OH2 TIP3 9 91.083 45.806 52.787 1.00 17.30
ATOM 2695 OH2 TIP3 10 85.712 53.247 29.331 1.00 28.74
ATOM 2696 OH2 TIP3 11 89.054 50.366 50.031 1.00 18.47
ATOM 2697 OH2 TIP3 12 79.767 46.999 50.439 1.00 24.04
ATOM 2698 OH2 TIP3 13 79.409 63.138 26.964 1.00 33.09
ATOM 2699 OH2 TIP3 14 79.565 49.986 49.688 1.00 26.32
ATOM 2700 OH2 TIP3 15 88.908 25.285 68.030 1.00 19.71
ATOM 2701 OH2 TIP3 16 59.074 46.155 52.416 1.00 29.76
ATOM 2702 OH2 TIP3 17 83.621 37.213 41.479 1.00 44.52
ATOM 2703 OH2 TIP3 18 59.694 46.148 55.800 1.00 34.49
ATOM 2704 OH2 TIP3 19 94.223 75.564 42.730 1.00 38.61
ATOM 2705 OH2 TIP3 20 66.565 41.471 68.855 1.00 40.67
ATOM 2706 OH2 TIP3 21 94.230 39.125 44.532 1.00 37.61
ATOM 2707 OH2 TIP3 22 63.801 32.294 63.606 1.00 24.75
ATOM 2708 OH2 TIP3 23 85.565 24.399 71.908 1.00 34.81
ATOM 2709 OH2 TIP3 24 78.876 64.139 29.678 1.00 39.51
ATOM 2710 OH2 TIP3 25 82.850 28.849 47.546 1.00 25.28
ATOM 2711 OH2 TIP3 28 69.172 69.500 51.903 1.00 30.62
ATOM 2712 OH2 TIP3 29 66.988 51.499 56.187 1.00 30.20
ATOM 2713 OH2 TIP3 30 91.719 42.057 60.299 1.00 31.11
ATOM 2714 OH2 TIP3 31 77.543 29.196 41.350 1.00 55.87
ATOM 2715 OH2 TIP3 32 84.613 22.791 54.244 1.00 64.87
ATOM 2716 OH2 TIP3 33 71.143 49.227 43.011 1.00 36.07
ATOM 2717 OH2 TIP3 34 72.527 52.824 53.213 1.00 5.56
ATOM 2718 OH2 TIP3 35 [?]0.116 46.082 38.809 1.00 44.35
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ATOM 


2719 


OH2 


TIP3 


36 


65.304 


58.306 


50.404 


1.00 


50.48 


ATOM 


2720 


OH2 


TIP3 


38 


77.283 


35.181 


38.894 


1.00 


36.34 


ATOM 


2721 


OH2 


TIP3 


39 


67.275 


62.937 


39.441 


1.00 


39.30 


ATOM 


2722 


OH2 


TIP3 


40 


95.912 


77.587 


39.202 


1.00 


37.03 


ATOM 


2723 


OH2 


TIP3 


41 


87.028 


46.815 


50.178 


1.00 


14.09 


ATOM 


2724 


OH2 


TIP3 


42 


70.710 


55.468 


34.204 


1.00 


7.82 


ATOM 


2725 


OH2 


TIP3 


43 


87.279 


67.965 


49.287 


1.00 


23.95 


ATOM 


2726 


OH2 


TIP3 


44 


90.719 


39 572 

W^ W • W f 4W 


46 110 

» w« ■ 1 w 


1 00 

1 • w w 


29 46 


ATOM 


2727 


OH2 


TIP3 


45 


88.956 


32.992 


50.562 


1 00 

1 .ww 


26.01 


ATOM 


2728 


OH2 


TIP3 


46 


99.309 


29.693 


67.462 


1 00 

1 .WW 


19.97 


ATOM 


2729 


OH2 


TIP3 


47 


99.173 


61.794 


37.290 


1.00 


31.61 


ATOM 


2730 


OH2 


TIP3 


48 


106.963 


58.636 


47,340 


1 00 


36 68 

WW. WW 


ATOM 


2731 


OH2 


TIP3 


49 


85.142 


35.133 


43.034 


1 00 

• .WW 


19 23 


ATOM 


2732 


OH2 


TIP3 


50 


71.616 


67.146 


35.073 


1.00 


36.90 


ATOM 


2733 


OH2 


TIP3 


51 


75.865 


23 589 


68 938 

W W« WWW 


1 00 

1 .WW 


31 60 

W 1 .WW 


ATOM 


2734 


OH2 


TIP3 


52 


84.026 


47.614 


42 760 

« W W 


1 00 

I .WW 


26 85 


ATOM 


2735 


OH2 


TIP3 


53 


99.401 


56.479 


39 758 

Ww. r w W 


1 00 

1 .WW 


*C • ■ ww 


ATOM 


2736 


OH2 


TIP3 


54 


68.040 


27.725 


53.497 


1 00 

■ .WW 


43 99 

w w 


ATOM 


2737 


OH2 


TIP3 


55 


78.583 


39.197 


40.332 


1.00 


24 68 

WW 


ATOM 


2738 


OH2 


TIP3 


56 


92.451 


75.069 


27.71 1 


1.00 


44 43 


ATOM 


2739 


OH2 


TIP3 


57 


89.623 


49.854 


35.111 


1 00 

1 .wW 


22 90 

fafc. . WW 


ATOM 


2740 


OH2 


TIP3 


58 


95.616 


73.094 


31.597 


1.00 


27 62 


ATOM 


2741 


OH2 


TIP3 


59 


86.253 


47.246 


59.568


1.00


24.37


ATOM 


2742 


OH2 


TIP3 


60 


98.881 


56.884


37.033


1.00


32.28


ATOM 


2743 


OH2 


TIP3 


61 


107.913 


59.670


50.434


1.00


40.73


ATOM 


2744 


OH2 


TIP3 


62 


63.410 


58.104 


43.219 


1.00


29.77


ATOM 


2745 


OH2 


TIP3 


63 


99.184 


62.943 


48.341 


1.00 


52.93


ATOM 


2746 


OH2 


TIP3 


64 


92.483 


47.893 


40.686 


1.00


33.31


ATOM 


2747 


OH2 


TIP3 


65 


76.471 


22.799 


61.366 


1.00


33.07


ATOM 


2748 


OH2 


TIP3 


66 


102.733 


70.701 


43.239 


1.00


52.51


ATOM 


2749 


OH2 


TIP3 


67 


94.669 


46.032 


46.815


1.00


^1 75 

w 1 . / w 


ATOM 


2750 


OH2 


TIP3 


68 


100.962


67.281 


48.137


1.00


52.13


ATOM 


2751 


OH2 


TIP3 


69 


76.478 


37.405 


40.145 


1.00 


25.79


ATOM 


2752 


OH2 


TIP3 


70 


90.165 


47.004 


58.770 


1.00 


49.51 


ATOM 


2753 


OH2 


TIP3 


71 


89.912 


38.696 


43.283 


1.00 


41.39


ATOM 


2754 


OH2 


TIP3 


72 


78.867 


77.910 


30.979 


1.00 


43.29


ATOM 


2755 


OH2 


TIP3 


73 


105.807 


71.061 


33.955 


1.00 


35.91 


ATOM 


2756 


OH2 


TIP3 


74 


94.956 


46.168 


41.472


1.00


32.83


ATOM 


2757 


OH2 


TIP3 


75 


81.755 


20.284 


64.565 


1.00


36.36


ATOM 


2758 


OH2 


TIP3 


76 


83.777 


36.376 


73.095 


1.00


35.68


ATOM 


2759 


OH2 


TIP3 


77 


90.384 


42.253 


62.669 


1.00 


45.35


ATOM 


2760 


OH2 


TIP3 


78 


88.037 


29.259 


75.147 


1.00


56.66


ATOM 


2761 


OH2 


TIP3 


79 


101.859 


65.470 


26.683 


1.00


39.64


ATOM 


2762 


OH2 


TIP3


80 


71.347 


27.770 


73.090


1.00


46.49


ATOM 


2763 


OH2 


TIP3 


81 


105.055 


72.336 


36.481 


1.00


34.57


ATOM 


2764 


OH2 


TIP3 


82 


89.487 


32.603 


43.952 


1.00 


55.98 


ATOM 


2765 


OH2 


TIP3 


83 
Figure 4